Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion

Internal identifier: 000771 (Main/Exploration); previous: 000770; next: 000772

Authors: Othman Lachhab [Morocco]; Joseph Di Martino [France]; Elhassane Ibn Elhaj [Morocco]; Ahmed Hammouch [Morocco]

Source: SpringerPlus, 2015 (eISSN 2193-1801)

RBID : PMC:4627987

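English descriptors: Automatic speech recognition (ASR); Esophageal speech assessment; Pathological voices; Speech enhancement; Voice conversion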

Abstract

In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion method. The esophageal speech is converted into “target” laryngeal speech using an iterative statistical estimation of a transformation function. We did not apply a speech synthesizer to reconstruct the converted speech signal, because the converted Mel cepstral vectors are used directly as input to our speech recognition system. Furthermore, the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method, which projects them into a lower-dimensional space with good discriminative properties. The experimental results show that the proposed system improves phone recognition accuracy by 3.40 % absolute compared with the accuracy obtained with neither HLDA nor voice conversion.
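
For orientation only: the abstract does not state the transformation function, and the authors describe their algorithm as a modified one. The classical GMM-based conversion function, in the form usually attributed to Stylianou, Cappé and Moulines (who appear in the record's bibliography), is sketched here as background, not as the method of the paper. All symbols are assumptions introduced for this sketch: $x$ is a source (esophageal) feature vector; $\alpha_i$, $\mu_i$ and $\Sigma_i$ are the weight, mean and covariance of component $i$ of an $M$-component GMM fitted to the source features; $\nu_i$ and $\Gamma_i$ are per-component parameters estimated from time-aligned source and target vectors.

```latex
\[
F(x) = \sum_{i=1}^{M} P(i \mid x)\,\bigl[\nu_i + \Gamma_i \Sigma_i^{-1} (x - \mu_i)\bigr],
\qquad
P(i \mid x) = \frac{\alpha_i\,\mathcal{N}(x;\mu_i,\Sigma_i)}
                   {\sum_{j=1}^{M} \alpha_j\,\mathcal{N}(x;\mu_j,\Sigma_j)}
\]
```

Under this formulation each converted vector is a posterior-weighted sum of per-component linear regressions, which is consistent with the abstract's remark that converted cepstral vectors can be passed to the recognizer without resynthesizing a waveform.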


URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627987
DOI: 10.1186/s40064-015-1428-2
PubMed: 26543778
PubMed Central: 4627987


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion</title>
<author>
<name sortKey="Lachhab, Othman" sort="Lachhab, Othman" uniqKey="Lachhab O" first="Othman" last="Lachhab">Othman Lachhab</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName>
<settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Di Martino, Joseph" sort="Di Martino, Joseph" uniqKey="Di Martino J" first="Joseph" last="Di Martino">Joseph Di Martino</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff2">LORIA, B.P. 239, Vandœuvre-lès-Nancy, 54506 France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, B.P. 239, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
<settlement type="city" wicri:auto="agglo">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Elhaj, Elhassane Ibn" sort="Elhaj, Elhassane Ibn" uniqKey="Elhaj E" first="Elhassane Ibn" last="Elhaj">Elhassane Ibn Elhaj</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff3">INPT, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>INPT, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName>
<settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Hammouch, Ahmed" sort="Hammouch, Ahmed" uniqKey="Hammouch A" first="Ahmed" last="Hammouch">Ahmed Hammouch</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName>
<settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26543778</idno>
<idno type="pmc">4627987</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627987</idno>
<idno type="RBID">PMC:4627987</idno>
<idno type="doi">10.1186/s40064-015-1428-2</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000020</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000020</idno>
<idno type="wicri:Area/Pmc/Curation">000020</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000020</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000016</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000016</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="wicri:Area/PubMed/Corpus">000031</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000031</idno>
<idno type="wicri:Area/PubMed/Curation">000031</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000031</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000061</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000061</idno>
<idno type="wicri:Area/Ncbi/Merge">000213</idno>
<idno type="wicri:Area/Ncbi/Curation">000206</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000206</idno>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01221503</idno>
<idno type="url">https://hal.inria.fr/hal-01221503</idno>
<idno type="wicri:Area/Hal/Corpus">000828</idno>
<idno type="wicri:Area/Hal/Curation">000828</idno>
<idno type="wicri:Area/Hal/Checkpoint">000239</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">000239</idno>
<idno type="wicri:doubleKey">2193-1801:2015:Lachhab O:a:preliminary:study</idno>
<idno type="wicri:Area/Main/Merge">000760</idno>
<idno type="wicri:Area/Main/Curation">000771</idno>
<idno type="wicri:Area/Main/Exploration">000771</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion</title>
<author>
<name sortKey="Lachhab, Othman" sort="Lachhab, Othman" uniqKey="Lachhab O" first="Othman" last="Lachhab">Othman Lachhab</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName>
<settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Di Martino, Joseph" sort="Di Martino, Joseph" uniqKey="Di Martino J" first="Joseph" last="Di Martino">Joseph Di Martino</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff2">LORIA, B.P. 239, Vandœuvre-lès-Nancy, 54506 France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, B.P. 239, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
<settlement type="city" wicri:auto="agglo">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Elhaj, Elhassane Ibn" sort="Elhaj, Elhassane Ibn" uniqKey="Elhaj E" first="Elhassane Ibn" last="Elhaj">Elhassane Ibn Elhaj</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff3">INPT, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>INPT, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName>
<settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Hammouch, Ahmed" sort="Hammouch, Ahmed" uniqKey="Hammouch A" first="Ahmed" last="Hammouch">Ahmed Hammouch</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff1">LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
<wicri:regionArea>LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat</wicri:regionArea>
<placeName>
<settlement type="city">Rabat</settlement>
<region nuts="2">Rabat-Salé-Kénitra</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">SpringerPlus</title>
<idno type="eISSN">2193-1801</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>Automatic speech recognition (ASR)</term>
<term>Esophageal speech assessment</term>
<term>Pathological voices</term>
<term>Speech enhancement</term>
<term>Voice conversion</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion method. The esophageal speech is converted into “target” laryngeal speech using an iterative statistical estimation of a transformation function. We did not apply a speech synthesizer to reconstruct the converted speech signal, because the converted Mel cepstral vectors are used directly as input to our speech recognition system. Furthermore, the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method, which projects them into a lower-dimensional space with good discriminative properties. The experimental results show that the proposed system improves phone recognition accuracy by 3.40 % absolute compared with the accuracy obtained with neither HLDA nor voice conversion.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Boll, Sf" uniqKey="Boll S">SF Boll</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Doi, D" uniqKey="Doi D">D Doi</name>
</author>
<author>
<name sortKey="Toda, T" uniqKey="Toda T">T Toda</name>
</author>
<author>
<name sortKey="Nakamura, K" uniqKey="Nakamura K">K Nakamura</name>
</author>
<author>
<name sortKey="Saruwatari, H" uniqKey="Saruwatari H">H Saruwatari</name>
</author>
<author>
<name sortKey="Shikano, K" uniqKey="Shikano K">K Shikano</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gales, Mjf" uniqKey="Gales M">MJF Gales</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanungo, T" uniqKey="Kanungo T">T Kanungo</name>
</author>
<author>
<name sortKey="Mount, D" uniqKey="Mount D">D Mount</name>
</author>
<author>
<name sortKey="Netanyahu, N" uniqKey="Netanyahu N">N Netanyahu</name>
</author>
<author>
<name sortKey="Piatko, C" uniqKey="Piatko C">C Piatko</name>
</author>
<author>
<name sortKey="Silverman, R" uniqKey="Silverman R">R Silverman</name>
</author>
<author>
<name sortKey="Wu, A" uniqKey="Wu A">A Wu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kumar, N" uniqKey="Kumar N">N Kumar</name>
</author>
<author>
<name sortKey="Andreou, A" uniqKey="Andreou A">A Andreou</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lee, Kf" uniqKey="Lee K">KF Lee</name>
</author>
<author>
<name sortKey="Hon, Hw" uniqKey="Hon H">HW Hon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, H" uniqKey="Liu H">H Liu</name>
</author>
<author>
<name sortKey="Zhao, Q" uniqKey="Zhao Q">Q Zhao</name>
</author>
<author>
<name sortKey="Wan, M" uniqKey="Wan M">M Wan</name>
</author>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mantilla Caeiros, A" uniqKey="Mantilla Caeiros A">A Mantilla-Caeiros</name>
</author>
<author>
<name sortKey="Nakano Miyatake, M" uniqKey="Nakano Miyatake M">M Nakano-Miyatake</name>
</author>
<author>
<name sortKey="Perez Meana, H" uniqKey="Perez Meana H">H Perez-Meana</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Matui, K" uniqKey="Matui K">K Matui</name>
</author>
<author>
<name sortKey="Hara, N" uniqKey="Hara N">N Hara</name>
</author>
<author>
<name sortKey="Kobayashi, N" uniqKey="Kobayashi N">N Kobayashi</name>
</author>
<author>
<name sortKey="Hirose, H" uniqKey="Hirose H">H Hirose</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pravena, D" uniqKey="Pravena D">D Pravena</name>
</author>
<author>
<name sortKey="Dhivya, S" uniqKey="Dhivya S">S Dhivya</name>
</author>
<author>
<name sortKey="Durga Devi, A" uniqKey="Durga Devi A">A Durga Devi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qi, Y" uniqKey="Qi Y">Y Qi</name>
</author>
<author>
<name sortKey="Weinberg, B" uniqKey="Weinberg B">B Weinberg</name>
</author>
<author>
<name sortKey="Bi, N" uniqKey="Bi N">N Bi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rabiner, Lr" uniqKey="Rabiner L">LR Rabiner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sharifzadeh, Hr" uniqKey="Sharifzadeh H">HR Sharifzadeh</name>
</author>
<author>
<name sortKey="Mcloughlin, Iv" uniqKey="Mcloughlin I">IV McLoughlin</name>
</author>
<author>
<name sortKey="Ahmadi, F" uniqKey="Ahmadi F">F Ahmadi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stylianou, Y" uniqKey="Stylianou Y">Y Stylianou</name>
</author>
<author>
<name sortKey="Cappe, O" uniqKey="Cappe O">O Cappé</name>
</author>
<author>
<name sortKey="Moulines, E" uniqKey="Moulines E">E Moulines</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tanaka, K" uniqKey="Tanaka K">K Tanaka</name>
</author>
<author>
<name sortKey="Toda, T" uniqKey="Toda T">T Toda</name>
</author>
<author>
<name sortKey="Neubig, G" uniqKey="Neubig G">G Neubig</name>
</author>
<author>
<name sortKey="Sakti, S" uniqKey="Sakti S">S Sakti</name>
</author>
<author>
<name sortKey="Nakamura, S" uniqKey="Nakamura S">S Nakamura</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Toda, T" uniqKey="Toda T">T Toda</name>
</author>
<author>
<name sortKey="Black, W" uniqKey="Black W">W Black</name>
</author>
<author>
<name sortKey="Tokuda, K" uniqKey="Tokuda K">K Tokuda</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turkmen, H" uniqKey="Turkmen H">H Türkmen</name>
</author>
<author>
<name sortKey="Karsligil, M" uniqKey="Karsligil M">M Karsligil</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wuyts, L" uniqKey="Wuyts L">L Wuyts</name>
</author>
<author>
<name sortKey="De Bodt, Ms" uniqKey="De Bodt M">MS De Bodt</name>
</author>
<author>
<name sortKey="Molenberghs, G" uniqKey="Molenberghs G">G Molenberghs</name>
</author>
<author>
<name sortKey="Remacle, M" uniqKey="Remacle M">M Remacle</name>
</author>
<author>
<name sortKey="Heylen, L" uniqKey="Heylen L">L Heylen</name>
</author>
<author>
<name sortKey="Millet, B" uniqKey="Millet B">B Millet</name>
</author>
<author>
<name sortKey="Van Lierde, K" uniqKey="Van Lierde K">K Van Lierde</name>
</author>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Van De Heyning, Ph" uniqKey="Van De Heyning P">PH Van de Heyning</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, P" uniqKey="Yu P">P Yu</name>
</author>
<author>
<name sortKey="Ouakine, M" uniqKey="Ouakine M">M Ouakine</name>
</author>
<author>
<name sortKey="Revis, J" uniqKey="Revis J">J Revis</name>
</author>
<author>
<name sortKey="Giovanni, A" uniqKey="Giovanni A">A Giovanni</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
<li>Maroc</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
<li>Rabat-Salé-Kénitra</li>
</region>
<settlement>
<li>Nancy</li>
<li>Rabat</li>
<li>Vandœuvre-lès-Nancy</li>
</settlement>
</list>
<tree>
<country name="Maroc">
<region name="Rabat-Salé-Kénitra">
<name sortKey="Lachhab, Othman" sort="Lachhab, Othman" uniqKey="Lachhab O" first="Othman" last="Lachhab">Othman Lachhab</name>
</region>
<name sortKey="Elhaj, Elhassane Ibn" sort="Elhaj, Elhassane Ibn" uniqKey="Elhaj E" first="Elhassane Ibn" last="Elhaj">Elhassane Ibn Elhaj</name>
<name sortKey="Hammouch, Ahmed" sort="Hammouch, Ahmed" uniqKey="Hammouch A" first="Ahmed" last="Hammouch">Ahmed Hammouch</name>
</country>
<country name="France">
<region name="Grand Est">
<name sortKey="Di Martino, Joseph" sort="Di Martino, Joseph" uniqKey="Di Martino J" first="Joseph" last="Di Martino">Joseph Di Martino</name>
</region>
</country>
</tree>
</affiliations>
</record>
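
The record above uses the `wicri:` and `nlm:` prefixes without declaring them in the text shown, so a standard XML parser would likely reject it as displayed. The following is a minimal sketch, not part of the Dilib toolchain, of one way to read the author and country fields with Python's standard library. The namespace URIs, the shortened `SAMPLE` excerpt, and the function names `parse_record` and `authors_with_country` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the record as displayed does not
# declare the wicri: and nlm: prefixes, so these values exist only to let a
# standard parser accept the text.
ASSUMED_NS = {
    "wicri": "urn:example:wicri",
    "nlm": "urn:example:nlm",
}

# Shortened excerpt of the record, kept small for the example.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A preliminary study</title>
<author>
<name sortKey="Lachhab, Othman" first="Othman" last="Lachhab">Othman Lachhab</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff1">LRGE Laboratory, ENSET, Rabat, Morocco</nlm:aff>
<country xml:lang="fr">Maroc</country>
</affiliation>
</author>
<author>
<name sortKey="Di Martino, Joseph" first="Joseph" last="Di Martino">Joseph Di Martino</name>
<affiliation wicri:level="3">
<nlm:aff id="Aff2">LORIA, B.P. 239, Vandœuvre-lès-Nancy, 54506 France</nlm:aff>
<country xml:lang="fr">France</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(text):
    """Parse a record after binding the undeclared prefixes on its root tag."""
    decls = " ".join(f'xmlns:{p}="{uri}"' for p, uri in ASSUMED_NS.items())
    patched = text.replace("<record>", f"<record {decls}>", 1)
    return ET.fromstring(patched)


def authors_with_country(root):
    """Return (name, country) pairs from the titleStmt author list."""
    pairs = []
    for author in root.findall("./TEI/teiHeader/fileDesc/titleStmt/author"):
        name = author.findtext("name")
        country = author.findtext("affiliation/country")
        pairs.append((name, country))
    return pairs


if __name__ == "__main__":
    for name, country in authors_with_country(parse_record(SAMPLE)):
        print(f"{name}: {country}")
```

The country values come back in French (`Maroc`, `France`) because the record tags them with `xml:lang="fr"`; the sketch leaves them untranslated.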

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000771 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000771 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:4627987
   |texte=   A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:26543778" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a InforLorV4 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022